SUMMARY


We  consider  a  heteroscedastic  linear  regression  model  with  replication. 
To  estimate  the  variances,  one  can  use  the  sample  variances  or  the  sample 
average  squared  errors  from  a  regression  fit.  We  study  the  large  sample 
properties  of  these  weighted  least  squares  estimates  with  estimated  weights 
when  the  number  of  replicates  is  small.  The  estimates  are  generally 
inconsistent  for  asymmetrically  distributed  data.  If  sample  variances  are 
used  based  on  m  replicates,  the  weighted  least  squares  estimates  are 
inconsistent  for  m  =  2  replicates  even  when  the  data  are  normally 
distributed.  With  between  3  and  5  replicates,  the  rates  of  convergence  are 
slower than the usual square root of N. With m ≥ 6 replicates, the effect of
estimating the weights is to increase variances by (m-3)/(m-5), relative to
weighted least squares estimates with known weights.




Section  1  :  Introduction 

Consider  a  heteroscedastic  linear  regression  model  with  replication: 


$$ y_{ij} = x_i^T \beta + \sigma_i \epsilon_{ij} \qquad (i = 1, \ldots, N;\ j = 1, \ldots, m). \tag{1.1} $$


In model (1.1), $\beta$ is a vector with p components, and the $\epsilon_{ij}$ are independent and
identically distributed mean zero random variables with variance one. The
heteroscedasticity in the model is governed by the unknown $\sigma_i$. We have taken
the number of replicates at each $x_i$ to be the constant m primarily as a
matter  of  convenience.  In  practice,  it  is  fairly  common  that  the  number  of 
design  vectors  N  is  large  while  the  number  of  replicates  m  is  small.  Our 
intention  is  to  construct  an  asymptotic  theory  in  this  situation  for  weighted 
least  squares  estimates  with  estimated  weights. 

As a benchmark, let $\hat\beta_{WLS}$ be the weighted least squares estimate with
weights $1/\sigma_i^2$. Of course, since the $\sigma_i$ are unknown this estimate cannot be
calculated from data. If m is fixed and

$$ S_{WLS} = \mathrm{plim}\ N^{-1} \sum_{i=1}^{N} x_i x_i^T / \sigma_i^2, $$

then

$$ (Nm)^{1/2} (\hat\beta_{WLS} - \beta) \Rightarrow \mathrm{Normal}(0, S_{WLS}^{-1}). \tag{1.2} $$


One common method for estimating weights uses the inverses of the sample
variances,

$$ \hat\sigma_{i1}^2 = s_i^2 = (m-1)^{-1} \sum_{j=1}^{m} (y_{ij} - \bar y_i)^2, \tag{1.3} $$

where $\bar y_i$ is the mean of the m replicates at $x_i$.


The resulting weighted least squares estimator will be denoted by $\hat\beta_{SV}$.


This  method  is  particularly  convenient  because  it  involves  sending  only 
the estimated weights to a computer program with a weighting option. The
obvious question is whether $\hat\beta_{SV}$ is any good, and whether the inferences made
by the computer program have any reliability. In Sections 3 and 4, we answer
both questions in the negative, at least for normally distributed data with
fewer than 10 replicates at each x. In many applied fields this is already
folklore (Garden et al., 1980). Yates and Cochran (1938) also have a nice
discussion  of  the  problems  with  using  the  sample  variances  to  estimate  the 
weights. 
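A minimal sketch of weighting by sample variances, assuming a single predictor through the origin (p = 1, so $y_{ij} = x_i \beta + \sigma_i \epsilon_{ij}$). The function names are hypothetical and only the Python standard library is used; it illustrates (1.3) and the resulting estimate, not any particular package.

```python
from statistics import mean, variance


def sample_variance_weights(groups):
    """Return the weights 1/s_i^2, one per group of replicates.

    statistics.variance uses the divisor m - 1, as in (1.3).
    Each group needs at least two replicates that are not all equal.
    """
    return [1.0 / variance(g) for g in groups]


def wls_slope(x, groups, weights):
    """Weighted least squares slope for the assumed one-predictor model.

    With m replicates per design point, the common factor m cancels.
    """
    num = sum(w * xi * mean(g) for w, xi, g in zip(weights, x, groups))
    den = sum(w * xi * xi for w, xi in zip(weights, x))
    return num / den


def beta_sv(x, groups):
    """Slope estimate with weights equal to inverse sample variances."""
    return wls_slope(x, groups, sample_variance_weights(groups))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    groups = [[1.9, 2.1, 2.0], [3.5, 4.5, 4.0], [6.2, 5.8, 6.0]]
    print(sample_variance_weights(groups))
    print(beta_sv(x, groups))
```

With only m = 3 replicates per design point, as in this toy input, the weights are very noisy, which is the situation the asymptotic theory is meant to describe.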

More precisely, for normally distributed data we are able to describe
the asymptotic distribution of $\hat\beta_{SV}$ for every m. For m ≥ 6, this is an easy
moment calculation and we show that $\hat\beta_{SV}$ is more variable than $\hat\beta_{WLS}$ by a
factor (m-3)/(m-5). The same result was obtained by Cochran (1937) for the
weighted mean. Not only is $\hat\beta_{SV}$ inefficient, but if one uses an ordinary
weighted regression package to compute $\hat\beta_{SV}$, the standard errors from the
package will be too small by a factor exceeding 20% unless m ≥ 10. For
example, if one uses m = 6 replicates, the efficiency with respect to
weighted least squares with known weights is only 1/3, and all estimated
standard errors should be multiplied by $\sqrt{3}$ = 1.732. For m ≤ 5, we use the
theory of stable laws and Cline (1986a, b) to describe the asymptotic
distributions. Perhaps the most interesting result here is that if only
duplicates (m = 2) are used, weighted least squares with estimated weights is
not even consistent. The results are outlined in Table 1.
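The factors quoted above for m ≥ 6 can be tabulated directly. The following is a small sketch; the function name `sv_factors` is hypothetical, and it only evaluates the stated formula (m-3)/(m-5) for normally distributed data.

```python
import math


def sv_factors(m: int) -> dict:
    """Asymptotic factors for sample-variance weights, normal data, m >= 6."""
    if m < 6:
        raise ValueError("the (m-3)/(m-5) factor applies only for m >= 6")
    inflation = (m - 3) / (m - 5)
    return {
        "variance_inflation": inflation,         # relative to known weights
        "relative_efficiency": 1.0 / inflation,  # (m-5)/(m-3)
        "se_factor": math.sqrt(inflation),       # multiplier for package SEs
    }


if __name__ == "__main__":
    for m in (6, 8, 10, 20):
        f = sv_factors(m)
        print(m, round(f["relative_efficiency"], 3), round(f["se_factor"], 3))
```

For m = 6 the standard error multiplier is the square root of 3, and it first drops below 1.2 at m = 10.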


TABLE  1  AFTER  THIS  POINT 


A second method for estimating weights is to use the linear structure of
the means. Write $\hat\beta_L$ for the unweighted least squares estimate and define the
average squared error estimate by

$$ \hat\sigma_{i2}^2 = \hat\sigma_{i2}^2(\hat\beta_L) = m^{-1} \sum_{j=1}^{m} (y_{ij} - x_i^T \hat\beta_L)^2. \tag{1.4} $$

The resulting weighted least squares estimate will be denoted by $\hat\beta_{EL}$.

A third method is the normal theory maximum likelihood estimate $\hat\beta_{ML}$,
which is a weighted least squares estimate with weights the inverse of

$$ \hat\sigma_{i3}^2 = \hat\sigma_{i2}^2(\hat\beta_{ML}). \tag{1.5} $$

This can be thought of as an iterated version of $\hat\beta_{EL}$.
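The relation between the second and third methods can be sketched as a fixed-point iteration. This is an illustration only, assuming a single predictor through the origin (p = 1); the function names, tolerance and iteration cap are hypothetical choices, and the iteration is not guaranteed to converge for every data set.

```python
from statistics import mean


def avg_sq_err(x, groups, beta):
    """Average squared errors m^{-1} sum_j (y_ij - x_i*beta)^2, as in (1.4)."""
    return [mean((y - xi * beta) ** 2 for y in g) for xi, g in zip(x, groups)]


def wls_slope(x, groups, variances):
    """Weighted least squares slope with weights 1/variances (p = 1)."""
    num = sum(xi * mean(g) / v for xi, g, v in zip(x, groups, variances))
    den = sum(xi * xi / v for xi, v in zip(x, variances))
    return num / den


def beta_el(x, groups):
    """One weighted step starting from unweighted least squares."""
    beta_l = wls_slope(x, groups, [1.0] * len(x))
    return wls_slope(x, groups, avg_sq_err(x, groups, beta_l))


def beta_ml(x, groups, tol=1e-12, max_iter=500):
    """Iterate the weighted step; a fixed point solves the likelihood equation."""
    beta = beta_el(x, groups)
    for _ in range(max_iter):
        new = wls_slope(x, groups, avg_sq_err(x, groups, beta))
        if abs(new - beta) < tol:
            return new
        beta = new
    return beta  # last iterate if the tolerance was not reached


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    groups = [[1.0, 3.0], [4.0, 5.0], [5.0, 8.0]]
    print(beta_el(x, groups), beta_ml(x, groups))
```

Stopping after the first weighted step gives the second method, while iterating to a fixed point gives weights of the form (1.5).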

These methods have been discussed in the literature for normally
distributed errors. Bement and Williams (1969) use (1.3), and construct
approximations (as $m \to \infty$) for the exact covariance matrix of the resulting
weighted least squares estimate. They do not discuss asymptotic
distributions as $N \to \infty$ with m fixed. Fuller & Rao (1978) use (1.4) while
Cochran (1937) and Neyman & Scott (1948) use (1.5). Both find limiting
distributions as $N \to \infty$ for fixed m ≥ 3, although the latter two papers
consider only the case that $x_i^T \beta \equiv \mu$.

One striking result concerns consistency. The estimates $\hat\beta_{SV}$, $\hat\beta_{EL}$ and
$\hat\beta_{ML}$ are always consistent for symmetrically distributed errors but generally
not otherwise; see Theorems 1 and 3. In Section 5, we compute the limit
distributions of $\hat\beta_{EL}$ and $\hat\beta_{ML}$. The relative efficiency of the two is
contrasted in the normal case for m ≥ 3, as follows.

Remark 1 : If ordinary least squares is less than 3 times more variable than
weighted least squares with known weights, then $\hat\beta_{EL}$ is more efficient than
maximum likelihood.

Remark 2 : If ordinary least squares is more than 5 times more variable than
weighted least squares with known weights, then maximum likelihood is more
efficient.

Further, for normally distributed data, maximum likelihood is more variable
than weighted least squares with known weights by a factor m/(m-2). This
means a tripling of variance for m = 3 even when using maximum likelihood.


We will assume throughout that $(x_i, \sigma_i)$ are independent and identically
distributed bounded random vectors, distributed independently of the $\{\epsilon_{ij}\}$.
We define $z_i = x_i/\sigma_i$ and $\hat d_i = \hat\sigma_i/\sigma_i$, and write $\bar\epsilon_i = m^{-1} \sum_{j=1}^{m} \epsilon_{ij}$. For any weighted least squares
estimator with estimated weights $\hat w_i = 1/\hat\sigma_i^2$,

$$ \hat\beta - \beta = \Big[ N^{-1} \sum_{i=1}^{N} z_i z_i^T / \hat d_i^2 \Big]^{-1} N^{-1} \sum_{i=1}^{N} z_i \bar\epsilon_i / \hat d_i^2. \tag{2.1} $$

Assuming they exist, we note that the asymptotic covariances of the weighted
and unweighted least squares estimators are, respectively,

$$ S_{WLS}^{-1} = \{E(zz^T)\}^{-1}, \tag{2.2} $$

$$ S_L^{-1} = \{E(xx^T)\}^{-1} E(\sigma^2 xx^T) \{E(xx^T)\}^{-1}. \tag{2.3} $$
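The step from the weighted normal equations to (2.1) can be written out as follows. This is a sketch under the notation assumed here: $\hat w_i = 1/\hat\sigma_i^2$ are the estimated weights, $\bar y_i$ and $\bar\epsilon_i$ are the replicate means of the $y_{ij}$ and $\epsilon_{ij}$, $z_i = x_i/\sigma_i$ and $\hat d_i^2 = \hat\sigma_i^2/\sigma_i^2$.

```latex
% Weighted normal equations with m replicates per design point, using
% \bar y_i = x_i^T \beta + \sigma_i \bar\epsilon_i.
\begin{align*}
\hat\beta
  &= \Big[ \sum_{i=1}^{N} m \hat w_i x_i x_i^T \Big]^{-1}
     \sum_{i=1}^{N} m \hat w_i x_i \bar y_i , \\
\hat\beta - \beta
  &= \Big[ \sum_{i=1}^{N} x_i x_i^T / \hat\sigma_i^2 \Big]^{-1}
     \sum_{i=1}^{N} x_i \sigma_i \bar\epsilon_i / \hat\sigma_i^2 , \\
\frac{x_i x_i^T}{\hat\sigma_i^2}
  &= \frac{z_i z_i^T}{\hat d_i^2} ,
  \qquad
  \frac{x_i \sigma_i \bar\epsilon_i}{\hat\sigma_i^2}
   = \frac{z_i \bar\epsilon_i}{\hat d_i^2} .
\end{align*}
```

Substituting the last two identities into the second line and dividing both sums by N gives the displayed representation of $\hat\beta - \beta$.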


Section  3  :  Weighting  with  Sample  Variances 
In this section, we describe consistency and asymptotic normality for
weighted least squares estimates with the weights being the inverse of
sample variances. We first describe the general case assuming that
sufficient moments exist. We then look more closely at the case of normally
distributed observations. In this setup, with $\bar\epsilon_i = m^{-1} \sum_{j=1}^{m} \epsilon_{ij}$,

$$ \hat d_i^2 = (m-1)^{-1} \sum_{j=1}^{m} (\epsilon_{ij} - \bar\epsilon_i)^2. $$

Define $\eta_{jk} = E(\bar\epsilon_i^{\,j} / \hat d_i^{2k})$ and $\nu_{jk} = E(|\bar\epsilon_i|^{j} / \hat d_i^{2k})$.

The first result indicates that we obtain consistency only when

$$ \eta_{11} = 0. \tag{3.1} $$

This is true for symmetrically distributed data, but generally not otherwise.


THEOREM 1 :

(a) If $\nu_{11} < \infty$ and $\nu_{01} < \infty$, then

$$ \mathrm{plim}\ \hat\beta_{SV} = \beta + (\eta_{01} S_{WLS})^{-1} \eta_{11} E(z), $$

so that consistency holds only if E(z) = 0 or (3.1) holds.

(b) If $\nu_{jk} < \infty$ for j ≤ 2, k ≤ 2 and (3.1) holds, then $(Nm)^{1/2} (\hat\beta_{SV} - \beta)$
is asymptotically Normal$\{0, m(\eta_{22}/\eta_{01}^2) S_{WLS}^{-1}\}$.

Proof of Theorem 1 : This follows from the weak law of large numbers and the
central limit theorem. □
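Whether (3.1) holds can be explored by simulation. The sketch below is a Monte Carlo estimate of $E(\bar\epsilon_i/\hat d_i^2)$ for one symmetric and one skewed error law, both with mean zero and variance one. The function names, the choice m = 10 and the simulation size are hypothetical choices made for illustration.

```python
import random


def eta11_monte_carlo(draw_error, m, n_groups, seed=0):
    """Estimate E(mean(eps) / d^2), where d^2 is the sample variance of m errors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_groups):
        eps = [draw_error(rng) for _ in range(m)]
        ebar = sum(eps) / m
        d2 = sum((e - ebar) ** 2 for e in eps) / (m - 1)
        total += ebar / d2
    return total / n_groups


def normal_error(rng):
    """Symmetric case: standard normal errors."""
    return rng.gauss(0.0, 1.0)


def centred_exponential_error(rng):
    """Skewed case: exponential errors shifted to mean zero (variance one)."""
    return rng.expovariate(1.0) - 1.0


if __name__ == "__main__":
    print("normal:     ", eta11_monte_carlo(normal_error, 10, 50_000))
    print("exponential:", eta11_monte_carlo(centred_exponential_error, 10, 50_000))
```

For the symmetric law the estimate should be close to zero. For the right-skewed law the replicate mean and the sample variance are positively dependent, so the estimate is clearly negative and the estimator is pulled away from the true value whenever E(z) is nonzero.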


For normally distributed observations, the assumption that $\nu_{jk} < \infty$ for
j, k ≤ 2 holds only if there are at least 6 replicates. In this case, we
have the following corollary.

COROLLARY 1 : Assume that the errors are normally distributed. For m ≥ 6,
$(Nm)^{1/2} (\hat\beta_{SV} - \beta)$ is asymptotically Normal$(0, \{(m-3)/(m-5)\} S_{WLS}^{-1})$.


Comparing with (1.2), we see that estimating the weights by sample
variances based on m ≥ 6 replicates inflates the variance by the factor
(m-3)/(m-5) over weighted least squares with known weights. Even with m =
10, this results in a 40% increase in variance.
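The moment calculation behind this factor can be checked numerically. The sketch below assumes normal errors, so that $(m-1)\hat d_i^2$ is chi-squared with m-1 degrees of freedom and independent of $\bar\epsilon_i$, and it reads the moments as $\eta_{01} = E(1/\hat d_i^2)$ and $\eta_{22} = E(\bar\epsilon_i^2/\hat d_i^4)$. The function names are hypothetical.

```python
import math


def inv_chisq_moment(nu, k):
    """E[(chi2_nu)^(-k)] = Gamma(nu/2 - k) / (2^k Gamma(nu/2)); infinite unless nu > 2k."""
    if nu <= 2 * k:
        return math.inf
    return math.gamma(nu / 2 - k) / (2 ** k * math.gamma(nu / 2))


def eta_01(m):
    """E(1/d^2) with d^2 = chi2_{m-1}/(m-1)."""
    return (m - 1) * inv_chisq_moment(m - 1, 1)


def eta_22(m):
    """E(ebar^2/d^4); ebar is independent of d^2 and E(ebar^2) = 1/m."""
    return (m - 1) ** 2 * inv_chisq_moment(m - 1, 2) / m


def sv_variance_factor(m):
    """The factor m * eta_22 / eta_01^2 from Theorem 1(b)."""
    return m * eta_22(m) / eta_01(m) ** 2


if __name__ == "__main__":
    for m in (6, 7, 10, 20):
        print(m, sv_variance_factor(m), (m - 3) / (m - 5))
```

The second inverse moment is finite only when m - 1 > 4, which is why the normal limit requires at least 6 replicates.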


If one uses a standard statistical package with weights $1/s_i^2$, then the
resulting standard errors will also be asymptotically incorrect. Such
packages estimate the asymptotic covariance matrix of $(Nm)^{1/2} (\hat\beta_{SV} - \beta)$ by
$\hat\sigma_{SV}^2 \hat S_{WLS}^{-1}$, where

$$ \hat\sigma_{SV}^2 = (Nm-p)^{-1} \sum_{i=1}^{N} \sum_{j=1}^{m} (y_{ij} - x_i^T \hat\beta_{SV})^2 / s_i^2 ; $$

$$ \hat S_{WLS} = N^{-1} \sum_{i=1}^{N} x_i x_i^T / s_i^2 . $$

If m ≥ 6 and if the data are normally distributed, then $\hat\sigma_{SV}^2$ converges in
probability to $E(1/\hat d_i^2) = (m-1)/(m-3)$, while $(\hat S_{WLS} - \{(m-1)/(m-3)\} S_{WLS}) \to 0$.
Thus, $(\hat\sigma_{SV}^2 \hat S_{WLS}^{-1} - S_{WLS}^{-1}) \to 0$. Asymptotically, therefore, standard errors
should be multiplied by $\{(m-3)/(m-5)\}^{1/2}$; see Table 1.


Section 4 : Sample Variances With m ≤ 5 Replicates in the Normal Case

In this section, we consider normally distributed data with m ≤ 5
replicates and the weights being the inverses of the sample variances. Here
Theorem 1 does not apply since $\bar\epsilon_i/\hat d_i^2$ does not have finite variance. The
results here are based on the work of Cline (1986). We first state a general
result which may be of independent interest. The results for weighted least
squares, assuming normal errors, are then derived as a corollary.

First, a few definitions are required. A positive function $\mu$ is
regularly varying with exponent $\rho$, denoted by $\mu \in RV(\rho)$, if

$$ \mu(yt)/\mu(t) \to y^{\rho} \quad \text{as } t \to \infty \text{ for all } y > 0. $$

Let $(z_i, u_i, w_i)$ be independent and identically distributed random
variables with $z_i \in \mathbb{R}^p$ independent of $(u_i, w_i)$, with $u_i$ having a symmetric
distribution and $w_i > 0$. Define $\mu_1(t) = E\{w I(w \le t)\}$ and
$\mu_2(t) = E\{(uw)^2 I(|uw| \le t)\}$. Let $(c_{1N}, c_{2N})$ be constants satisfying, as $N \to \infty$,

$$ N \mu_1(c_{1N})/c_{1N} \to 1 \quad \text{and} \quad N \mu_2(c_{2N})/c_{2N}^2 \to 1. $$

If $\alpha_1 < 1$, then $S_1 = S_1(\alpha_1)$ will denote a positive stable random
variable with Laplace transform

$$ E\{\exp(-tS_1)\} = \exp\{-\Gamma(2-\alpha_1) t^{\alpha_1}/\alpha_1\}. $$

If $\alpha_1 = 1$ then $S_1 = 1$ almost surely. We will denote by $S_2 = S_2(\alpha_2)$ a
symmetric stable random variable with characteristic function

$$ E\{\exp(itS_2)\} = \exp\{-\Gamma(3-\alpha_2) \cos(\pi\alpha_2/2) |t|^{\alpha_2} / (\alpha_2(1-\alpha_2))\}. $$

Of course, if $\alpha_2 = 2$ then $S_2$ is standard normal.


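THEOREM 2 : Assume that $\mu_1 \in RV(1-\alpha_1)$, $\mu_2 \in RV(2-\alpha_2)$ and that

$$ \Big[ c_{1N}^{-1} \sum_{i=1}^{N} w_i ,\ c_{2N}^{-1} \sum_{i=1}^{N} u_i w_i \Big] \tag{4.1} $$

is asymptotically distributed as $(S_1(\alpha_1), S_2(\alpha_2))$. Suppose that for some $\delta > 0$
and all i, j,

$$ E(|z_{ij}|^{\gamma}) < \infty \quad \text{for } \gamma = \min(2, \max(2\alpha_1, \alpha_2) + \delta). $$

Then there exist $Y_1$, $Y_2$ ($Y_2 \in \mathbb{R}^p$, $Y_1$ p×p positive definite) such that

$$ (c_{1N}/c_{2N}) \Big[ \sum_{i=1}^{N} z_i z_i^T w_i \Big]^{-1} \sum_{i=1}^{N} z_i u_i w_i $$

is asymptotically distributed as $Y_1^{-1} Y_2$. Further, for any $b \in \mathbb{R}^p$, $b^T Y_1 b$ and
$b^T Y_2$ have the same distributions, respectively, as

$$ \{E[|b^T z|^{2\alpha_1}]\}^{1/\alpha_1} S_1 \quad \text{and} \quad \{E[|b^T z|^{\alpha_2}]\}^{1/\alpha_2} S_2. $$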


Remark 3 : In Theorem 2, $Y_1$ and $Y_2$ are not necessarily independent unless $\alpha_1 = 1$
or $\alpha_2 = 2$. In the former case, $Y_1 = E(zz^T)$ almost surely, while in the
latter case $Y_2$ is normally distributed with mean zero and covariance $E(zz^T)$.


Proof of Theorem 2 : Consider first the case $\alpha_1 < 1$. From Theorem 1 of Cline
(1986) we get that

$$ \Big[ c_{1N}^{-1} \sum_{i=1}^{N} z_i z_i^T w_i ,\ c_{2N}^{-1} \sum_{i=1}^{N} z_i u_i w_i \Big] $$

is asymptotically distributed as $(Y_1, Y_2)$. In the case that $\alpha_1 = 1$, then $S_1 = 1$
almost surely by Feller (1971, p. 236). From unpublished work of Cline and
from Gnedenko & Kolmogorov (1954, p. 134), for each (j, h),

$$ \mathrm{plim}\ c_{1N}^{-1} \sum_{i=1}^{N} z_{ij} z_{ih} w_i = E(z_{ij} z_{ih}). $$

The joint convergence of the remaining terms, $c_{2N}^{-1} \sum_{i=1}^{N} z_i u_i w_i$, again follows
from Theorem 1 of Cline (1986).

In either case, convergence of the ratio follows. The limiting joint
distribution is difficult to describe, but the stated marginal distributions
of $b^T Y_1 b$ and $b^T Y_2$ can be inferred from Proposition 3 of Breiman (1965) and
Theorem 3 of Maller (1981). One may also conclude that $Y_1$ and $Y_2$ are
independent if $\alpha_1 = 1$, since then $Y_1$ is degenerate. Also, $Y_1$ and $Y_2$ are
independent if $\alpha_2 = 2$, since then $Y_2$ is Gaussian, and for such limits the
non-Gaussian stable component is always independent of the Gaussian component
(cf. Sharpe, 1969). □


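Special Cases : If $(t/a_1)^{\tau_1} \Pr(w > t) \to 1$ for $\tau_1 > 0$, then we have the
following cases:

(i) If $\tau_1 < 1$, then $\alpha_1 = \tau_1$, $c_{1N} = a_1 \{N/(1-\alpha_1)\}^{1/\alpha_1}$, and $S_1$ is positive
stable.

(ii) If $\tau_1 = 1$, then $\alpha_1 = 1$, $c_{1N} = a_1 N \log(N)$, and $S_1 = 1$.

(iii) If $\tau_1 > 1$, then $\alpha_1 = 1$, $c_{1N} = N E(w)$, and $S_1 = 1$.

If $(t/a_2)^{\tau_2} \Pr(|uw| > t) \to 1$ for $\tau_2 > 0$, then we have the following cases:

(i) If $\tau_2 < 2$, then $\alpha_2 = \tau_2$, $c_{2N} = a_2 \{N/(2-\alpha_2)\}^{1/\alpha_2}$, and $S_2$ is symmetric
stable.

(ii) If $\tau_2 = 2$, then $\alpha_2 = 2$, $c_{2N} = 2^{1/2} a_2 \{N \log(N)\}^{1/2}$, and $S_2$ is Normal.

(iii) If $\tau_2 > 2$, then $\alpha_2 = 2$, $c_{2N} = \{N E|uw|^2\}^{1/2}$, and $S_2$ is Normal.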


Consider the case of normally distributed errors in model (1.1), where
we make the identifications $z_i = x_i/\sigma_i$, with $E(zz^T) = S_{WLS}$. Further, write

$$ w_i = 1/\hat d_i^2, \qquad u_i = \bar\epsilon_i = m^{-1} \sum_{j=1}^{m} \epsilon_{ij}. $$

Of course, $w_i$ and $u_i$ are independent and $E(u^2) < \infty$. Set $\alpha = (m-1)/2$,
$a = \alpha \{\Gamma(1+\alpha)\}^{-1/\alpha}$ and $b = (2/m)^{1/2} \{\pi^{-1/2} \Gamma((1+\alpha)/2)\}^{1/\alpha}$. Then $(t/a)^{\alpha} \Pr(w > t) \to 1$,
$(t/ab)^{\alpha} \Pr(|uw| > t) \to 1$ and, if $\alpha > 1$, $E(w) = \alpha/(\alpha-1)$. With the indicated
choices of $c_{1N}$ and $c_{2N}$, Theorem 1 of Cline (1986a) shows that (4.1) holds.
Thus the conditions of Theorem 2 are met.


COROLLARY 2 : In the normally distributed case, with $S_1$, $S_2$, $Y_1$, $Y_2$ as
defined in Theorem 2, we have the following cases:

Case 1 (m = 2) : $\alpha_1 = \alpha_2 = 1/2$ and
$(\hat\beta_{SV} - \beta)$ is asymptotically distributed as $\{\Gamma^2(3/4)/(9\pi)\} Y_1^{-1} Y_2$.

Case 2 (m = 3) : $\alpha_1 = \alpha_2 = 1$, and
$\log(N) (\hat\beta_{SV} - \beta)$ is asymptotically distributed as $\{2/(3\pi)\}^{1/2} Y_1^{-1} Y_2$.

Case 3 (m = 4) : $\alpha_1 = 1$, $\alpha_2 = 3/2$ and
$N^{1/3} (\hat\beta_{SV} - \beta)$ is asymptotically distributed as $2^{-1/2} \{\Gamma^2(1/4)/(18\pi^2)\}^{1/3} Y_1^{-1} Y_2$.

Case 4 (m = 5) : $\alpha_1 = 1$, $\alpha_2 = 2$ and
$\{N/\log(N)\}^{1/2} (\hat\beta_{SV} - \beta)$ is asymptotically distributed as $5^{-1/2} Y_1^{-1} Y_2$.

Case 5 (m ≥ 6) : Covered by Corollary 1 already. □


Proof of Corollary 2 : In the notation of Theorem 2, with $b_N = c_{1N}/c_{2N}$, $b_N (\hat\beta_{SV} - \beta)$
is asymptotically distributed as $Y_1^{-1} Y_2$. Thus, in each case it suffices to
construct the constants $(c_{1N}, c_{2N})$.

(m = 2) : Here $\alpha = 1/2$, $c_{1N} = (8/\pi) N^2$ and $c_{2N} = \{8 \Gamma^2(3/4)/(9\pi^2)\} N^2$.

(m = 3) : Here $\alpha = 1$, $c_{1N} = N \log(N)$ and $c_{2N} = \{2/(3\pi)\}^{1/2} N$.

(m = 4) : Here $\alpha = 3/2$, $c_{1N} = 3N$ and $c_{2N} = 2^{-1/2} \{3 \Gamma^2(1/4)/(2\pi^2)\}^{1/3} N^{2/3}$.

(m = 5) : Here $\alpha = 2$, $c_{1N} = 2N$ and $c_{2N} = 2 \cdot 5^{-1/2} \{N \log(N)\}^{1/2}$. □
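The construction of the normalising constants can be sketched in code. This is an illustration under stated assumptions: the tail constants are taken as $a = \alpha\{\Gamma(1+\alpha)\}^{-1/\alpha}$ and $ab$ with $b = (E|u|^{\alpha})^{1/\alpha}$ for u normal with variance 1/m, and the stable-case constant for the second sum is taken in the same form as for the first, with 2 in place of 1. The function names are hypothetical.

```python
import math


def normal_case_constants(m: int, n: int):
    """Return (alpha, c1N, c2N) for normal errors with 2 <= m <= 5 replicates."""
    if not 2 <= m <= 5:
        raise ValueError("the stable-law regime covers m = 2, ..., 5 only")
    alpha = (m - 1) / 2
    # Tail constant of w = 1/d^2: (t/a)^alpha * Pr(w > t) -> 1.
    a = alpha * math.gamma(1 + alpha) ** (-1 / alpha)
    # b = (E|u|^alpha)^(1/alpha) for u ~ Normal(0, 1/m); assumed form.
    b = math.sqrt(2 / m) * (
        math.gamma((1 + alpha) / 2) / math.sqrt(math.pi)
    ) ** (1 / alpha)
    a2 = a * b
    if alpha < 1:
        c1 = a * (n / (1 - alpha)) ** (1 / alpha)
    elif alpha == 1:
        c1 = a * n * math.log(n)
    else:
        c1 = n * alpha / (alpha - 1)  # N * E(w)
    if alpha < 2:
        c2 = a2 * (n / (2 - alpha)) ** (1 / alpha)
    else:
        c2 = math.sqrt(2) * a2 * math.sqrt(n * math.log(n))
    return alpha, c1, c2


def rate(m: int, n: int) -> float:
    """Normalising sequence b_N = c1N / c2N applied to the estimation error."""
    _, c1, c2 = normal_case_constants(m, n)
    return c1 / c2


if __name__ == "__main__":
    for m in (2, 3, 4, 5):
        print(m, [round(rate(m, n), 3) for n in (10**2, 10**4, 10**6)])
```

For m = 2 the sequence does not grow with N, which is the inconsistency result. For m = 3, 4 and 5 it grows like log(N), the cube root of N, and the square root of N/log(N) respectively.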


Section 5 : Estimating Variances by Sample Average Squared Errors

One might reasonably conjecture that making use of the known linear
structure for the means results in improvements over using only sample
variances. We will show that this is the case, at least for normally
distributed data. Let $\hat\beta_0$ be any estimate of $\beta$, and define

$$ \hat\sigma_i^2(\hat\beta_0) = m^{-1} \sum_{j=1}^{m} (y_{ij} - x_i^T \hat\beta_0)^2 $$

and $\hat d_i^2(\hat\beta_0) = \hat\sigma_i^2(\hat\beta_0)/\sigma_i^2$. We denote by $\hat\beta_G$ the weighted estimate with the
estimated weights $1/\hat\sigma_i^2(\hat\beta_0)$. As defined in the introduction, $\hat\beta_{EL}$ uses $\hat\beta_0 = \hat\beta_L$,
the ordinary unweighted least squares estimate, and $\hat\beta_{ML}$ uses $\hat\beta_0 = \hat\beta_{ML}$.

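Our results here rely on the consistency of $\hat\beta_0$, and two other reasonable
moment conditions for m large enough. Here are the assumptions.

$$ \mathrm{plim}\ \hat\beta_0 = \beta. \tag{5.1} $$

For each $c_1 > 0$, there exists $c_2 > 0$ such that

$$ E\Big\{ \sup_{\|\beta_* - \beta\| \le c_2} |\hat d_i^{-2}(\beta_*) - \hat d_i^{-2}(\beta)| \Big\} \le c_1 \tag{5.2} $$

and such that

$$ [\text{display illegible in the source}] \tag{5.3} $$

In addition, we assume the finite existence of

$$ \eta_{jk} = E\{\bar\epsilon_i^{\,j} / \hat d_i^{2k}(\beta)\}, \qquad \nu_{jk} = E\{|\bar\epsilon_i|^{j} / \hat d_i^{2k}(\beta)\} < \infty \quad \text{for } j, k \le 2. \tag{5.4} $$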


The first result describes the consistency of $\hat\beta_G$.

THEOREM 3 : Assume (5.1) - (5.4). Then

$$ \mathrm{plim}\ \hat\beta_G = \beta + \eta_{11} (\eta_{01} S_{WLS})^{-1} E(z). \tag{5.5} $$

Thus, $\hat\beta_G$ is consistent only if $\eta_{11} = 0$ or E(z) = 0. Further, as $N \to \infty$,

$$ N^{1/2} (\hat\beta_G - \beta) = A_N^{-1} [B_N + C_N N^{1/2} (\hat\beta_0 - \beta)], \tag{5.6} $$

where plim $A_N = \eta_{01} S_{WLS}$, $B_N = N^{-1/2} \sum_{i=1}^{N} z_i \bar\epsilon_i / \hat d_i^2(\beta)$ and plim $C_N = 2 \eta_{22} S_{WLS}$. □

Proof of Theorem 3 : Since the $\{z_i\}$ are bounded, the assumptions make
possible the usual Taylor series argument leading to (5.6), and (5.5) is an
immediate consequence of (5.6). □


Assuming consistency of the maximum likelihood estimator, we can compute
the limit distributions of $\hat\beta_{EL}$ and $\hat\beta_{ML}$.

THEOREM 4 : Make the assumptions of Theorem 3, with the $\{\epsilon_{ij}\}$ being
symmetrically distributed.

(a) Let $V_1 = E(xx^T)$ and $V_2 = E(\sigma^2 xx^T)$ be finite and positive definite.
Let $S_L^{-1} = V_1^{-1} V_2 V_1^{-1}$. Then with $\hat\beta_0$ chosen as the unweighted least squares
estimate, $(Nm)^{1/2} (\hat\beta_{EL} - \beta)$ is asymptotically Normal$(0, S_{EL}^{-1})$, where

$$ S_{EL}^{-1} = m (\eta_{22}/\eta_{01}^2) \{ (1 + 4\eta_{21}) S_{WLS}^{-1} + 4 (\eta_{22}/m) S_L^{-1} \}. $$

(b) With $\hat\beta_0 = \hat\beta_{ML}$, the maximum likelihood estimate,
$(Nm)^{1/2} (\hat\beta_{ML} - \beta)$ is asymptotically Normal$(0, S_{ML}^{-1})$, where

$$ S_{ML}^{-1} = m \eta_{22} (\eta_{01} - 2\eta_{22})^{-2} S_{WLS}^{-1}. \qquad \Box $$

Proof of Theorem 4 : Parts (a) and (b) follow easily from Theorem 3,
Slutsky's theorem and the fact that $(Nm)^{1/2} (\hat\beta_L - \beta)$ is asymptotically
Normal$(0, S_L^{-1})$. □


COROLLARY 3 : For normally distributed observations with m ≥ 3,
$(Nm)^{1/2} (\hat\beta_{ML} - \beta)$ and $(Nm)^{1/2} (\hat\beta_{EL} - \beta)$ are asymptotically normally
distributed with respective covariances

$$ \{m/(m-2)\} S_{WLS}^{-1} \quad \text{and} \quad \{(1 + 2m^{-1} - 8m^{-2}) S_{WLS}^{-1} + 4m^{-2} S_L^{-1}\}. $$

Proof of Corollary 3 : By direct calculation as in Fuller & Rao (1978),
$\eta_{01} = m/(m-2)$, $\eta_{21} = 1/m$ and $\eta_{22} = 1/(m-2)$. □


As noted by Fuller & Rao (1978), the asymptotic covariance of $\hat\beta_{EL}$
consists of a mixture of the weighted least squares covariance $S_{WLS}^{-1}$ and the
unweighted least squares covariance $S_L^{-1}$. Comparing $\hat\beta_{EL}$ with the maximum
likelihood estimate $\hat\beta_{ML}$ depends on how much bigger $S_L^{-1}$ is than $S_{WLS}^{-1}$.
Detailed calculations verify Remarks 1 and 2 of the introduction. Thus,
doing iterative weighted least squares may actually hurt, unless the starting
value $\hat\beta_L$ is sufficiently bad.
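The comparison in Remarks 1 and 2 can be sketched numerically from the covariances in Corollary 3. The sketch assumes the scalar situation in which the unweighted covariance is r times the weighted one, so that r measures how much more variable ordinary least squares is. The function names are hypothetical.

```python
def ml_factor(m: int) -> float:
    """Variance of maximum likelihood relative to known-weight WLS."""
    return m / (m - 2)


def el_factor(m: int, r: float) -> float:
    """Variance of the one-step estimate relative to known-weight WLS.

    r is the assumed ratio of the unweighted to the weighted covariance.
    """
    return 1 + 2 / m - 8 / m ** 2 + 4 * r / m ** 2


def crossover_ratio(m: int) -> float:
    """Value of r at which el_factor and ml_factor coincide."""
    return (3 * m - 4) / (m - 2)


if __name__ == "__main__":
    for m in (3, 4, 6, 10, 50):
        print(m, round(crossover_ratio(m), 3))
```

The crossover is 5 at m = 3 and decreases towards 3 as m grows, so a ratio below 3 always favours the one-step estimate and a ratio above 5 always favours maximum likelihood.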


Section 6 : Discussion


Our  results  can  be  summarized  as  follows  : 


(a)  If  nothing  is  known  about  the  structure  of  the  sample  variances, 
then  none  of  the  common  weighted  estimates  can  be  assumed  to  be  consistent 
for  data  from  an  asymmetric  distribution. 


(b)  Using  sample  variances  as  a  basis  for  estimating  weights  is 
inefficient unless the number of replicates m is fairly large, e.g., m ≥ 10.

(c)  Using  sample  average  squared  errors  from  a  preliminary  fit  to  the 
regression  function  as  a  basis  for  estimating  weights  is  typically  more 
efficient  than  using  sample  variances.  However,  even  here  a  fair  number  of 
replicates  is  helpful.  For  example,  the  maximum  likelihood  estimate  for 
normally  distributed  data  based  on  6  replicates  still  has  standard  errors 
approximately  20%  larger  than  ordinary  weighted  least  squares  theory  would 
suggest. 

There  are  at  least  two  alternative  methods  for  estimating  the  weights. 
The first is to model the variances parametrically, e.g.,

$$ \sigma_i = \sigma (x_i^T \beta)^{\theta}. $$

See  Carroll  &  Ruppert  (1987)  and  Davidian  &  Carroll  (1987).  The  second  is  to 
perform  a  nonparametric  regression  of  (1.3)  and  (1.4)  against  the  predictors 
and  use  this  regression  to  estimate  the  weights  (c.f.  Carroll,  1982). 

TABLE 1


A  summary  of  the  results  when  the  weights  are  the  inverses  of  sample 
variances  based  on  m  replicates.  The  relative  efficiency  is  calculated  with 
respect  to  weighted  least  squares  with  known  weights.  The  column  labelled 
"Standard  Error  Factor"  is  the  number  one  should  multiply  standard  errors 
from  a  weighted  least  squares  package  by  to  obtain  asymptotically  correct 
standard  errors. 


(m)    Consistent?    Asymptotically Normal?    Rate of convergence    Relative Efficiency    Standard Error Factor


